We prove an exponential approximation for the law of approximate
occurrence of typical patterns for a class of Gibbsian sources
on the lattice $\mathbb{Z}^d$, $d\ge 2$. From this result, we deduce a law of large
numbers and a large deviation result for the waiting time of distorted
patterns.

1. Introduction. In recent years there has been growing interest in a
detailed probabilistic analysis of pattern matching and approximate pattern
matching. For example, in information theory, motivation comes from
studying the performance of idealized Lempel-Ziv coding schemes. In mathematical
biology one would like to have accurate estimates for the probability that two
(e.g., DNA) sequences agree in a large interval up to some error percentage.
There is also considerable interest in the analysis of occurrence of patterns
in the multi-dimensional setting, for example, in the context of video-image
compression [2], and more generally, lossy data compression [5, 6, 10].

In this paper we study the following problem. Fix a pattern $A_n$ in a cubic
box of size $n$. Given a configuration $\omega$ of a Gibbs random field, what is the
size of the “observation window” in which we see for the first time not
necessarily this exact pattern, but any pattern obtained by distortion of the
fixed pattern $A_n$? By this, we mean a pattern which contains a fixed fraction
$\varepsilon$ of spins different from those of $A_n$. We are interested in the behavior of
the volume of this observation window, which we call the “approximate
hitting-time,” when $n$ grows.

Our main result (Theorem 2.6) can be phrased as follows. The distribution
of the approximate hitting-time, when properly normalized, gets closer
and closer to an exponential law. The normalization is the product of a
certain parameter $\Lambda_n$ and the probability of the set of distorted patterns.
In fact, we get a precise control of the error term which allows us to derive
two corollaries for the “approximate waiting-time”: given a configuration $\eta$
randomly chosen from an ergodic Gibbs random field, we increase the
observation window in a random configuration $\sigma$ drawn from the given Gibbs
random field until we see approximately the pattern $\eta_{C_n}$. The first corollary
implies a law of large numbers, which allows one to obtain the rate-distortion
function almost surely from this approximate waiting-time. The second corollary is
related to large deviation bounds. While the law of large numbers for
approximate waiting-times appears in [6] (under different conditions), the large
deviation result is new. We emphasize that Theorem 2.6 is a new result in
the context of approximate pattern-matching.

We briefly indicate the key ingredients needed to prove this exponential
approximation. First, we assume that the Gibbs random field satisfies
a certain strong mixing condition (the nonuniform $\varphi$-mixing condition). For
instance, this property holds for all Markov random fields which satisfy the
Dobrushin uniqueness condition. The second key ingredient is a result by
Chi [4] allowing one to obtain the rate distortion function “à la
Shannon-McMillan-Breiman.” We take advantage of our previous work [1] in which
we deal with “exact” hitting-times. The proof of the main result of the
present work follows a large part of the proof in [1], but there is
a crucial step which is different (the second moment estimate). Moreover, one
has to restrict to “good” patterns: if a pattern has “too much overlap” with
its translates by vectors of size of order $n$, then one cannot hope to obtain
an exponential distribution. These good patterns are shown to be typical in
the sense that their measure approaches one exponentially fast as $n$ diverges
(Proposition 2.7). When we have a random field distributed according to a
Bernoulli measure, the goodness assumption on patterns can be removed. In
this case, we prove (Theorem 2.8) that Theorem 2.6 applies to any pattern.
Surprisingly, our proof involves the strong invariance principle for simple
random walks. We have no idea how to provide a simpler proof.

Outline of the paper. In the next section we set notation and definitions
and state our main theorems. In Section 3 we apply the exponential
approximation of the previous section to approximate waiting times, for which we
obtain almost-sure strong approximation and large deviation results. In Section 4
we give the proofs.

2. Set-up and main results. For the sake of simplicity, we consider a
$\{0,1\}$-valued random field on the lattice $\mathbb{Z}^d$, $d\ge 2$. Our results hold for any
finite alphabet as well. Configurations are denoted $\eta,\sigma,\omega$ and collected in
the set $\Omega=\{0,1\}^{\mathbb{Z}^d}$. $\Omega$ is provided with the Borel $\sigma$-field, and for $V\subset\mathbb{Z}^d$,
$\mathcal{F}_V$ denotes the $\sigma$-algebra generated by $\{\sigma_x : x\in V\}$.

For a finite subset $V\subset\mathbb{Z}^d$ and configurations $\sigma,\eta\in\Omega$ we denote by

$$\Delta(V,\eta,\sigma)=\sum_{x\in V}|\eta_x-\sigma_x| \tag{2.1}$$

the number of mismatches between $\sigma$ and $\eta$ in the volume $V$, that is, the
Hamming distance between $\eta_V$ and $\sigma_V$.
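As an illustration of (2.1), the mismatch count can be computed directly. The following is a minimal sketch and not part of the paper: it assumes that configurations are stored as Python dictionaries mapping lattice sites (tuples) to spins in {0, 1}, and the function name `mismatches` is hypothetical.

```python
from itertools import product


def mismatches(volume, eta, sigma):
    """Number of sites x in `volume` where eta and sigma differ (Hamming distance)."""
    return sum(abs(eta[x] - sigma[x]) for x in volume)


# A 2 x 2 box in Z^2 and two configurations restricted to it.
box = list(product(range(2), repeat=2))
eta = {x: 0 for x in box}
sigma = {(0, 0): 0, (0, 1): 1, (1, 0): 0, (1, 1): 1}

print(mismatches(box, eta, sigma))  # 2
```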

We denote by $C_n$ the $n$-cube $[0,n]^d\cap\mathbb{Z}^d$. An $n$-pattern is a map
$A_n : C_n\to\{0,1\}$. It is naturally associated to its cylinder $[A_n]=
\{\sigma\in\Omega : \sigma_{C_n}=A_n\}$. For a pattern $A_n$ and $x\in\mathbb{Z}^d$ we denote by $\theta_x A_n$ the
pattern supported on $C_n+x$ defined by $\theta_x A_n(y+x)=A_n(y)$ $(y\in C_n)$.

We let $[A_n]^\varepsilon$ denote the set of configurations which $\varepsilon$-match with $A_n$:

$$[A_n]^\varepsilon=\{\omega\in\Omega : \Delta(C_n,\omega,A_n)\le\varepsilon|C_n|\}. \tag{2.2}$$

The set of configurations $[A_n]^\varepsilon$ can also be viewed as a set of $n$-patterns.
With a slight abuse of notation, we will use the same symbol for the set of
configurations and the set of $n$-patterns which are restrictions of
configurations in $[A_n]^\varepsilon$ to $C_n$.

Definition 2.1. The approximate hitting-time of $[A_n]^\varepsilon$ in a
configuration $\sigma$ is defined as

$$T_{[A_n]^\varepsilon}(\sigma)=\min\{|C_k| : k\ge 0,\ \exists x\in\mathbb{Z}^d,\ C_n+x\subset C_k \text{ and } \theta_{-x}\sigma\in[A_n]^\varepsilon\}. \tag{2.3}$$

In words, given a configuration, the approximate hitting-time of the
distorted pattern $[A_n]^\varepsilon$ is by definition the smallest volume of a $k$-cube
(“observation window”) such that there is some translate of an $n$-cube, contained
in the observation window, which “hits” the $\varepsilon|C_n|$-ball (in the Hamming
distance) around $A_n$.
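To make Definition 2.1 concrete, here is a brute-force sketch in dimension one. It is an illustration under simplifying assumptions, not the paper's construction: the configuration is a finite list, the pattern has `m` sites, and the observation window is taken to be `{0, ..., k-1}`, so its volume is `k`. The function name is hypothetical.

```python
def approximate_hitting_time(sigma, pattern, eps):
    """Smallest window size k such that some translate of `pattern` lying in
    sigma[0:k] differs from sigma in at most eps * len(pattern) sites.
    Returns None if no such translate exists within the finite sample."""
    m = len(pattern)
    for x in range(len(sigma) - m + 1):
        d = sum(s != a for s, a in zip(sigma[x:x + m], pattern))
        if d <= eps * m:
            return x + m  # the window {0, ..., x + m - 1} has x + m sites
    return None


sigma = [1, 1, 0, 1, 0, 0, 1, 0]
pattern = [0, 0, 1]
print(approximate_hitting_time(sigma, pattern, 0.0))   # exact match: 7
print(approximate_hitting_time(sigma, pattern, 0.34))  # one mismatch allowed: 4
```

Allowing distortion can only decrease the hitting time, since every exact occurrence is also an approximate one.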

For $\varepsilon=0$ (exact matching time or occurrence time of a pattern), we
obtained in [1] an exponential approximation for the law of $T_{[A_n]}$ under
the hypotheses of nonuniform $\varphi$-mixing and Gibbsianness of the random
field. We recall here this mixing assumption. For $m>0$, define

$$\varphi(m)=\sup\frac{1}{|A_1|}\left|\mathbb{P}(E_{A_1}\mid E_{A_2})-\mathbb{P}(E_{A_1})\right|, \tag{2.4}$$

where the supremum is taken over all finite subsets $A_1,A_2$ of $\mathbb{Z}^d$ with
$d(A_1,A_2)\ge m$ [as usual, $d(A_1,A_2):=\inf\{d(x,y) : x\in A_1,\ y\in A_2\}$ and
$d(x,y):=\|x-y\|_\infty=\max_{1\le i\le d}|x_i-y_i|$] and over all events $E_{A_i}\in\mathcal{F}_{A_i}$ with $\mathbb{P}(E_{A_2})>0$.

Note that this $\varphi(m)$ differs from the usual $\varphi$-mixing function since we divide
by the size of the dependence set of the event $E_{A_1}$.
Definition 2.2. A random field is nonuniformly exponentially $\varphi$-mixing
if there exist constants $C_1,C_2>0$ such that

$$\varphi(m)\le C_1e^{-C_2m}\quad\text{for all } m>0. \tag{2.5}$$

A typical example of a Gibbs field satisfying this assumption is the 2d
Ising model above the critical temperature. In general, it is satisfied in the
so-called high-temperature regime of Dobrushin uniqueness. We refer the reader
to [8, 9] for more details on this and on Gibbs measures in general.

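An important property of Gibbs measures is the so-called “finite energy”
property. This means that there is a continuous version of the conditional
probability $\mathbb{P}(\sigma_0=\cdot\mid\sigma_{\mathbb{Z}^d\setminus\{0\}})$ such that

$$\delta\le\mathbb{P}(\sigma_0=0\mid\sigma_{\mathbb{Z}^d\setminus\{0\}})\le 1-\delta, \tag{2.6}$$

where $\delta\in(0,\tfrac12)$ is independent of $\sigma$. This immediately implies the existence
of $\lambda>0$ such that, for all $V\subset\mathbb{Z}^d$ and all $\eta\in\Omega$,

$$\mathbb{P}(\{\sigma : \sigma_V=\eta_V\})\le e^{-\lambda|V|}. \tag{2.7}$$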

We will use the following estimate: 


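Lemma 2.3. Under the assumption that $\mathbb{P}$ is a Gibbs measure, there
exist $\varepsilon_c>0$ and $\kappa=\kappa(\varepsilon_c)>0$ such that, for any pattern $A_n$ and any $\varepsilon<\varepsilon_c$,

$$\mathbb{P}([A_n]^\varepsilon)\le e^{-\kappa n^d}.$$

Proof. This is an immediate consequence of the estimate (2.7) and the
estimate

$$\left|\{B_n : \Delta(C_n,B_n,A_n)\le\varepsilon|C_n|\}\right|\le e^{f(\varepsilon)|C_n|}$$

on the number of $n$-patterns in the Hamming ball around $A_n$, with $f(\varepsilon)\downarrow 0$ if $\varepsilon\downarrow 0$. □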
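One standard way to obtain a counting estimate of this kind is the entropy bound on the volume of a Hamming ball: for $\varepsilon\le 1/2$, the number of binary patterns on $N$ sites within distance $\varepsilon N$ of a fixed pattern is at most $e^{N h(\varepsilon)}$, with $h$ the binary entropy in nats. The sketch below checks this numerically. Taking $f=h$ is an assumption made here for illustration; the function names are hypothetical.

```python
from math import comb, log


def ball_size(N, eps):
    """Number of {0,1}-patterns on N sites within Hamming distance eps*N of a fixed one."""
    return sum(comb(N, k) for k in range(int(eps * N) + 1))


def binary_entropy(eps):
    """Binary entropy in nats; it tends to 0 as eps tends to 0."""
    if eps in (0.0, 1.0):
        return 0.0
    return -eps * log(eps) - (1 - eps) * log(1 - eps)


for N in (10, 100, 400):
    for eps in (0.05, 0.1, 0.25):
        # log of the ball volume versus the entropy bound N * h(eps)
        print(N, eps, log(ball_size(N, eps)), N * binary_entropy(eps))
```

Combined with a bound of the form (2.7) on each single pattern, a union bound over the ball gives an exponentially small probability as soon as the entropy term is smaller than the decay rate of a single cylinder.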


Contrary to the situation for exact matching, we will need an assumption 
on the patterns in order to obtain an exponential law. This can be compared 
with the condition of not being “badly self-repeating” needed to obtain the 
exponential law for return times in [1]. As we shall see, being a “good”
pattern is a typical property. 


Definition 2.4. Given $0<\alpha<1$, $0<\varepsilon<1$, we say that an $n$-pattern
$A_n$ is $(\varepsilon,\alpha)$-good if the set $[A_n]^\varepsilon\cap\theta_x[A_n]^\varepsilon$ is empty for all $x$ such that
$0<|x|<\alpha n$. The set of all $(\varepsilon,\alpha)$-good patterns is denoted by $\mathcal{G}_n(\varepsilon,\alpha)$. By abuse
of notation, we use the same symbol for the set of configurations $\omega$ such that
$\omega_{C_n}$ is $(\varepsilon,\alpha)$-good.

For $\varepsilon=0$ and $\alpha<1/2$, $\mathcal{G}_n(0,\alpha)$ coincides with the set of nonbadly
self-repeating patterns in [1], Definition 5.1.
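Goodness can be tested by hand in dimension one. A configuration in the intersection must be within the mismatch budget of both the pattern and its translate. Outside the overlap it can copy whichever pattern is relevant, and on the overlap each site where the pattern disagrees with its translate costs one mismatch to exactly one of the two. Hence the intersection is nonempty exactly when the number of such disagreements is at most twice the budget. The sketch below implements this criterion. It assumes a pattern given as a list of `m` sites and a budget of `int(eps * m)` mismatches; the function name is hypothetical.

```python
def is_good(pattern, eps, alpha):
    """Check (eps, alpha)-goodness of a one-dimensional binary pattern.

    For each shift 0 < x < alpha * m, count the overlap sites where the pattern
    disagrees with its own translate. The two eps-balls intersect exactly when
    this count is at most 2 * int(eps * m)."""
    m = len(pattern)
    budget = 2 * int(eps * m)
    for x in range(1, m):
        if x >= alpha * m:
            break
        disagreements = sum(pattern[y] != pattern[y - x] for y in range(x, m))
        if disagreements <= budget:
            return False  # the pattern overlaps too well with its shift by x
    return True


print(is_good([0] * 12, 0.1, 0.5))        # constant pattern: False
print(is_good([0, 1] * 6, 0.1, 0.15))     # only the shift x = 1 is tested: True
print(is_good([0, 1] * 6, 0.1, 0.5))      # the shift x = 2 reproduces the pattern: False
```

By symmetry the count for a shift by $-x$ equals the count for $x$, so only positive shifts are examined.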

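We shall need a result by Chi [4] on the rate distortion function. We recall
briefly the definition of the rate distortion function and refer the reader to [3]
for more information and background and to [6] for a discussion on lossy data
compression. Given a stationary and ergodic measure $\mathbb{Q}$ and a stationary and
ergodic Gibbs measure $\mathbb{P}$, the rate distortion function $\mathcal{R}(\mathbb{Q},\mathbb{P},\varepsilon)$ is defined
as follows:

$$\mathcal{R}(\mathbb{Q},\mathbb{P},\varepsilon)=\lim_{n\to\infty}\mathcal{R}_n(\mathbb{Q},\mathbb{P},\varepsilon), \tag{2.8}$$

$$\mathcal{R}_n(\mathbb{Q},\mathbb{P},\varepsilon)=\inf_{J_n}\frac{1}{|C_n|}H(J_n\,\|\,\mathbb{Q}_n\times\mathbb{P}_n), \tag{2.9}$$

where the infimum is taken over all joint distributions $J_n$ on $\{0,1\}^{C_n}\times\{0,1\}^{C_n}$
such that the first $\{0,1\}^{C_n}$-marginal of $J_n$ is $\mathbb{Q}_n$ and

$$\int\frac{\Delta(C_n,\omega,\sigma)}{|C_n|}\,dJ_n(\omega,\sigma)\le\varepsilon.$$

$H(J_n\,\|\,\mathbb{Q}_n\times\mathbb{P}_n)$ is the relative entropy between $J_n$ and $\mathbb{Q}_n\times\mathbb{P}_n$.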

We have the following key result which follows from [4] and [6], Theorem
25.

Proposition 2.5. Let $\mathbb{Q}$ be a stationary and ergodic measure and $\mathbb{P}$ be
a stationary and ergodic Gibbs measure. Then

$$\mathcal{R}(\mathbb{Q},\mathbb{P},\varepsilon)=-\lim_{n\to\infty}\frac{1}{|C_n|}\log\mathbb{P}([\omega_{C_n}]^\varepsilon)\quad\mathbb{Q}\text{-almost surely}. \tag{2.10}$$

Moreover, $\mathcal{R}$ is a convex (and, hence, continuous) function of $\varepsilon$ and is
nonzero in some interval $[0,\varepsilon_0)$.

The property (2.10) is called the generalized asymptotic equipartition
property in [6]. Throughout we will simply write $\mathcal{R}(\varepsilon)$ instead of $\mathcal{R}(\mathbb{Q},\mathbb{P},\varepsilon)$.
We can now state our main result.
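The limit in (2.10) can be evaluated by hand in the simplest case. If $\mathbb{P}$ is the fair Bernoulli measure on $n$ sites, the probability of the $\varepsilon$-ball does not depend on the pattern and equals the ball volume divided by $2^n$, so the limit is $\log 2$ minus the binary entropy of $\varepsilon$ for $0<\varepsilon<1/2$. The sketch below compares the finite-$n$ quantity with this limit. It covers only this special case, in one dimension, and the function names are hypothetical.

```python
from math import comb, log


def empirical_rate(n, eps):
    """-(1/n) log P([A_n]^eps) for the fair Bernoulli measure on n sites."""
    ball = sum(comb(n, k) for k in range(int(eps * n) + 1))
    return log(2) - log(ball) / n  # P([A_n]^eps) = ball / 2^n


def limit_rate(eps):
    """log 2 minus the binary entropy of eps, for 0 < eps < 1/2."""
    return log(2) + eps * log(eps) + (1 - eps) * log(1 - eps)


for n in (50, 500, 2000):
    print(n, empirical_rate(n, 0.1), limit_rate(0.1))
```

The finite-$n$ values approach the limit from above, with a correction of order $(\log n)/n$.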


Theorem 2.6. Suppose that $\mathbb{P}$ is a nonuniformly exponentially
$\varphi$-mixing Gibbs measure and $\mathbb{Q}$ is a stationary and ergodic Gibbs measure.
Assume that the rate distortion function (2.8) is strictly positive in $[0,\varepsilon_0)$.
Then for all $\alpha\in(0,1)$ and $\varepsilon>0$ small enough, namely,

$$\frac{\varepsilon}{\alpha}<\varepsilon_0,$$

there exist $\Lambda_1,\Lambda_2,C,c\in(0,\infty)$ such that, for every $t>0$, $n\ge 1$, and
$\mathbb{Q}$-almost all $\omega$ with $\omega_{C_n}\in\mathcal{G}_n(\varepsilon,\alpha)$, the following estimate holds:

$$\left|\mathbb{P}\left(T_{[\omega_{C_n}]^\varepsilon}>\frac{t}{\Lambda_n\mathbb{P}([\omega_{C_n}]^\varepsilon)}\right)-e^{-t}\right|\le Ce^{-cn^d}, \tag{2.11}$$

where $\Lambda_n=\Lambda(\omega_{C_n})$ is such that

$$\Lambda_1\le\Lambda_n\le\Lambda_2. \tag{2.12}$$

Dependence of the parameters in Theorem 2.6 on $\varepsilon$ and $\alpha$ will be discussed
after the proof; see Remark 4.1.

Let us briefly comment on the difference between Theorem 2.6 and the
one obtained in [1] for exact matching, that is, the case corresponding to
$\varepsilon=0$. First of all, we need to restrict ourselves to special patterns, that is,
$(\varepsilon,\alpha)$-good patterns, whereas in [1] the result applies to all patterns. Second,
the error term that we obtain in [1] is of the form $Ce^{-ct}\,\mathbb{P}([\omega_{C_n}])^{\rho}$, where
$\rho>0$. Of course, the factor $\mathbb{P}([\omega_{C_n}])^{\rho}$ is uniformly exponentially small for
Gibbs measures. This is no longer true for $\mathbb{P}([\omega_{C_n}]^\varepsilon)$ if $\varepsilon$ is too large. This
is precisely why we need Lemma 2.3. Third, a crucial step in the proof
of Theorem 2.6, which differs slightly from that in [1] for the case $\varepsilon=0$,
involves Proposition 2.5. This explains why we need to restrict to typical
configurations in the sense of this result.

Let us close this set of remarks by noticing that, for Theorem 2.6 itself, $\mathbb{Q}$ only
has to be a stationary and ergodic measure, not necessarily Gibbsian. For later use of
Theorem 2.6, however, we shall also need the Gibbs assumption, so we already impose
it in the statement of the theorem.

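The following proposition shows that “$\omega_{C_n}\in\mathcal{G}_n(\varepsilon,\alpha)$,” that is, a
pattern being $(\varepsilon,\alpha)$-good, is a typical property.

Proposition 2.7. Let $\mathbb{Q}$ be a stationary Gibbs measure. Then, if $\alpha<
1/2$ and $\varepsilon>0$ is small enough, there exists $\nu>0$ such that, for all $n\ge 1$,

$$\mathbb{Q}(\mathcal{G}_n(\varepsilon,\alpha))\ge 1-e^{-\nu n^d}. \tag{2.13}$$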

It turns out that if the random field has a nontrivial dependence structure, 
then the restriction to (e, a)-good patterns is unavoidable. However, in the 
case of a random field distributed according to a Bernoulli measure, the 
exponential law (2.11) holds for all approximate patterns. This is expressed 
by the following theorem. 

Theorem 2.8. If $\mathbb{P}$ is the Bernoulli measure with $\mathbb{P}(\sigma_0=1)=1/2$,
then (2.11) holds without the restriction that $\omega_{C_n}$ is $(\varepsilon,\alpha)$-good.
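A small simulation can give a feel for the normalization in (2.11) in the fair Bernoulli case. The sketch below is only an illustration under simplifying assumptions (dimension one, a pattern with `m` sites, windows `{0, ..., k-1}`), and it does not verify the theorem: it merely estimates the mean of the hitting time multiplied by the probability of the distorted pattern, which should stay of order one. All names are hypothetical.

```python
import random
from math import comb


def sample_hitting_time(pattern, max_mismatch, rng):
    """Draw fair coin flips until a window of len(pattern) consecutive sites
    has at most `max_mismatch` mismatches with `pattern`; return the number
    of sites observed up to that point."""
    m = len(pattern)
    window = [rng.randint(0, 1) for _ in range(m)]
    size = m
    while sum(w != a for w, a in zip(window, pattern)) > max_mismatch:
        window.pop(0)
        window.append(rng.randint(0, 1))
        size += 1
    return size


def normalized_mean(pattern, max_mismatch, trials=2000, seed=0):
    """Monte Carlo estimate of E[T] * P([A]^eps) under the fair Bernoulli measure."""
    rng = random.Random(seed)
    m = len(pattern)
    prob = sum(comb(m, k) for k in range(max_mismatch + 1)) / 2 ** m
    total = sum(sample_hitting_time(pattern, max_mismatch, rng) for _ in range(trials))
    return prob * total / trials


print(normalized_mean([0, 1, 1, 0, 1, 0, 0, 1], max_mismatch=1))
```

For an exponential limit law with parameter $\Lambda_n$, the normalized mean would be close to $1/\Lambda_n$; at such a small pattern size, finite-size effects are still visible.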

3. Approximate waiting-time fluctuations. The purpose of this section
is to derive two consequences of Theorem 2.6 and Proposition 2.7. The first
one implies a strong law of large numbers for the approximate waiting-time.
It was previously derived in [6] directly using the mixing property (2.5). The
second one concerns large deviations of the approximate waiting-time and it
is a new result. Given two configurations $\omega,\sigma$, the approximate waiting-time
is $W_n^\varepsilon(\omega,\sigma):=T_{[\omega_{C_n}]^\varepsilon}(\sigma)$.

Proposition 3.1. Under the assumptions of Theorem 2.6 and
Proposition 2.7, there exists $\gamma_0>0$ such that, for all $\gamma>\gamma_0$,

$$-\gamma\log n\le\log\left(W_n^\varepsilon(\omega,\sigma)\,\mathbb{P}([\omega_{C_n}]^\varepsilon)\right)\le\log(\log n^\gamma) \tag{3.1}$$

$\mathbb{Q}\times\mathbb{P}$-eventually almost surely. In particular,

$$\lim_{n\to\infty}\frac{1}{|C_n|}\log W_n^\varepsilon(\omega,\sigma)=\mathcal{R}(\varepsilon)\quad\mathbb{Q}\times\mathbb{P}\text{-almost surely}.$$

With Proposition 3.1, we recover the results of Theorems 26 and 27 in [6].
However, there is a substantial difference in the conditions on the random fields. We
have to restrict ourselves to measures $\mathbb{Q}$ which are stationary and ergodic
Gibbs measures, while in [6] $\mathbb{Q}$ is only assumed to be stationary and ergodic.
On the other hand, we permit $\mathbb{P}$ to be Gibbsian, while in [6] $\mathbb{P}$ must be a
Bernoulli measure. The reason for our assumptions on $\mathbb{Q}$ is that Proposition
2.7 is valid for Gibbs measures. We do not know if it can be extended to
more general situations.

Let us also remark that, by a basic result in probability theory, this strong
approximation implies that if a central limit theorem holds for
$-(1/|C_n|)\log\mathbb{P}([\omega_{C_n}]^\varepsilon)$, then it also holds for $(1/|C_n|)\log W_n^\varepsilon$. Unfortunately, the
former seems to be a difficult issue, except in the i.i.d. case. We refer the
reader to [6] for some results in that direction.

We have the following (partial) large deviation results. We first need the
following lemma showing that we can define the generalized conditional
$q$-order Rényi entropy for Gibbs random fields. This was first done in [11] for
($\alpha$-mixing) stochastic processes ($d=1$), with the difference that here we need
to condition on $(\varepsilon,\alpha)$-good patterns and use the Gibbs property instead of
mixing.

Lemma 3.2. Let $\mathbb{Q},\mathbb{P}$ be stationary Gibbs measures and assume that
$\alpha<1/2$ and $0<\varepsilon<1$. Then, for all $q\in\mathbb{R}$, the following function is well
defined:

$$\mathcal{E}_\varepsilon(q)=\lim_{n\to\infty}\frac{1}{|C_n|}\log\int\mathbb{P}([\omega_{C_n}]^\varepsilon)^q\,d\mathbb{Q}_{\mathcal{G}_n(\varepsilon,\alpha)}(\omega). \tag{3.2}$$

($\mathbb{Q}_{\mathcal{G}_n(\varepsilon,\alpha)}$ denotes the measure $\mathbb{Q}$ conditioned on the set of good patterns.)

The generalized $q$-order Rényi entropy should be defined as $-\mathcal{E}_\varepsilon(-q)/q$.
We now have the following theorem. By $a_n\approx b_n$, we mean that

$$\max\{a_n/b_n,\ b_n/a_n\}$$

is bounded from above.
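Theorem 3.3. Let $\mathbb{P}$ be a nonuniformly exponentially $\varphi$-mixing Gibbs
measure and $\mathbb{Q}$ a stationary and ergodic Gibbs measure. If $\varepsilon>0$ is small
enough, then for any $\alpha_0<\alpha<1/2$, we have

$$\iint\left(W_n^\varepsilon(\omega,\sigma)\right)^q\,d\mathbb{Q}_{\mathcal{G}_n(\varepsilon,\alpha)}(\omega)\,d\mathbb{P}(\sigma)\approx\int\mathbb{P}([\omega_{C_n}]^\varepsilon)^{-q}\,d\mathbb{Q}_{\mathcal{G}_n(\varepsilon,\alpha)}(\omega)\quad\text{if } q>-1 \tag{3.3}$$

and

$$\iint\left(W_n^\varepsilon(\omega,\sigma)\right)^q\,d\mathbb{Q}_{\mathcal{G}_n(\varepsilon,\alpha)}(\omega)\,d\mathbb{P}(\sigma)\approx\int\mathbb{P}([\omega_{C_n}]^\varepsilon)\,d\mathbb{Q}_{\mathcal{G}_n(\varepsilon,\alpha)}(\omega)\quad\text{if } q\le-1. \tag{3.4}$$

In particular,

$$\lim_{n\to\infty}\frac{1}{|C_n|}\log\iint\left(W_n^\varepsilon(\omega,\sigma)\right)^q\,d\mathbb{Q}_{\mathcal{G}_n(\varepsilon,\alpha)}(\omega)\,d\mathbb{P}(\sigma)=\begin{cases}\mathcal{E}_\varepsilon(-q), & \text{if } q>-1,\\ \mathcal{E}_\varepsilon(1), & \text{if } q\le-1.\end{cases} \tag{3.5}$$

It follows from this theorem that Theorem 4.5.20 in [7] applies to
$\{(1/|C_n|)\log W_n^\varepsilon(\omega,\sigma)\}_n$. However, to obtain a full large deviation principle, we would
need to know under which conditions the function $q\mapsto\mathcal{E}_\varepsilon(q)$ is, for instance,
differentiable for $q>-1$ (and for $\varepsilon$ small enough). If that were the case,
we would have a large deviation principle with a rate function given by the
Legendre transform of $\mathcal{E}_\varepsilon(-q)$.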


4. Proofs. 


4.1. Proof of Theorem 2.6. The proof of Theorem 2.6 is quite similar to 
the proof of exponential law in [1]. We describe briefly the common approach 
and indicate the differences. We also provide the necessary modifications of 
the proof. 

It is well known that a nonnegative random variable $Z$ has an exponential
distribution if and only if, for all $s,t\ge 0$,

$$\mathbb{P}(Z>s+t\mid Z>t)=\mathbb{P}(Z>s)$$

or, equivalently,

$$\mathbb{P}(Z>s+t)=\mathbb{P}(Z>s)\,\mathbb{P}(Z>t).$$

The basic ingredient of the proof in [1] was Lemma 4.4 (“Iteration Lemma”).
This result establishes that, for a pattern $A_n$ and any finite number of cubes
$C_i\subset\mathbb{Z}^d$, $i=1,\dots,k$, with equal volumes, we have

$$\mathbb{P}\Big(A_n\text{ does not occur in }\bigcup_{i=1}^kC_i\Big)\approx\mathbb{P}(A_n\text{ does not occur in }C_1)^k. \tag{4.1}$$

In [1] we also observed that the Iteration Lemma remains valid if a pattern
$A_n$ is replaced by the event $[A_n]^\varepsilon$, where $[A_n]^\varepsilon$ is said not to occur in a volume $V$ if
no pattern $B_n\in[A_n]^\varepsilon$ occurs in the volume $V$.

Another important ingredient of the proof is the control of the
parameter of the exponential distribution. Lemma 4.3 (“The parameter”) in [1]
concerns nontriviality of the parameter $\Lambda_n$, that is, the fact that it is
neither null nor infinite. To prove Lemma 4.3, we established a uniform second
moment estimate for the number of occurrences of a pattern $A_n$ in a
configuration $\sigma$ restricted to a box whose size is later taken to be $1/\mathbb{P}([A_n])$.
It is the proof of this second moment estimate that we have to modify
completely. In Remark 4.1 in [1], we noticed that if $E_n\in\mathcal{F}_{C_n}$ are events such
that $\mathbb{P}(E_n)\le e^{-cn^d}$ for some $c>0$, and such that

$$\limsup_{n\to\infty}\sum_{0<|x|<n}\frac{\mathbb{P}(E_n\cap\theta_xE_n)}{\mathbb{P}(E_n)}<\infty, \tag{4.2}$$

then this implies, together with the mixing property (2.5) and the Gibbs
property (2.7), that the desired uniform second moment estimate holds. In
turn, this implies the nontriviality of the parameter (2.12) (Lemma 4.3 in
[1]).

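Thus, we turn to prove (4.2) when the event $E_n$ is $[A_n]^\varepsilon$, where $A_n$ is
a good and typical pattern. We assume that the patterns $A_n$ are such that
$A_n=\sigma_{C_n}$, with $\sigma$ chosen in the set with $\mathbb{Q}$-measure one from Proposition
2.5, and such that $A_n$ is good in the sense of Definition 2.4.

We have to show for patterns $A_n\in\mathcal{G}_n(\varepsilon,\alpha)$ with $\varepsilon/\alpha<\varepsilon_0$ that there exists
a finite number $C(\varepsilon,\alpha)$ such that, for all $n$,

$$\sum_{0<|x|<n}\frac{\mathbb{P}([A_n]^\varepsilon\cap\theta_x[A_n]^\varepsilon)}{\mathbb{P}([A_n]^\varepsilon)}\le C(\varepsilon,\alpha). \tag{4.3}$$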


First of all, since $A_n\in\mathcal{G}_n(\varepsilon,\alpha)$ (see Definition 2.4), the terms
corresponding to $x$ with $|x|<\alpha n$ are equal to 0. Therefore, we have to estimate the
sum

$$\sum_{\alpha n\le|x|<n}\frac{\mathbb{P}([A_n]^\varepsilon\cap\theta_x[A_n]^\varepsilon)}{\mathbb{P}([A_n]^\varepsilon)}. \tag{4.4}$$

Note that, for $x$ with $|x|\ge\alpha n$, the intersection $(C_n+x)\cap C_n$ is not very
large:

$$|(C_n+x)\cap C_n|\le(1-\alpha)n^d.$$

Note also that $\Delta(V,\omega,A_n)$ denotes the number of differences between $\omega$ and
$A_n$ in the volume $V$, see (2.1). Then we can write


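$$\mathbb{P}([A_n]^\varepsilon\cap\theta_x[A_n]^\varepsilon)=\mathbb{P}\left(\omega : \Delta(C_n,\omega,A_n)\le\varepsilon n^d\ \text{and}\ \Delta(C_n+x,\omega,\theta_{-x}A_n)\le\varepsilon n^d\right), \tag{4.5}$$

where, by $\theta_{-x}A_n$, we mean $\theta_{-x}A_n(y+x)=A_n(y)$, $y\in C_n$. For the sake of
convenience, we simply write $C$ for $C_n$ and $C_x$ for $C_n+x$ in the course of
this proof. We also introduce the short-hand notation

$$\begin{aligned}
S_1&=\Delta(C\setminus C_x,\omega,A_n),\\
S_2&=\Delta(C\cap C_x,\omega,A_n),\\
S_3&=\Delta(C\cap C_x,\omega,\theta_{-x}A_n),\\
S_4&=\Delta(C_x\setminus C,\omega,\theta_{-x}A_n).
\end{aligned} \tag{4.6}$$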
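The decomposition (4.6) simply splits the two mismatch counts in (4.5) according to whether a site lies in the overlap of the two boxes. The following one-dimensional sketch computes the four quantities and checks that they add up as expected. It assumes a nonnegative shift `x`, a pattern of `m` sites and a configuration given as a list long enough to cover both boxes; the function name is hypothetical.

```python
def split_mismatches(omega, pattern, x):
    """Return (S1, S2, S3, S4) for the boxes C = {0..m-1} and C_x = C + x, x >= 0."""
    m = len(pattern)
    C = set(range(m))
    Cx = set(range(x, x + m))
    shifted = {y + x: pattern[y] for y in range(m)}  # translate of the pattern, supported on C_x
    s1 = sum(omega[y] != pattern[y] for y in C - Cx)
    s2 = sum(omega[y] != pattern[y] for y in C & Cx)
    s3 = sum(omega[y] != shifted[y] for y in C & Cx)
    s4 = sum(omega[y] != shifted[y] for y in Cx - C)
    return s1, s2, s3, s4


omega = [0, 1, 1, 0, 1, 0, 0]
pattern = [0, 1, 0, 1]
s1, s2, s3, s4 = split_mismatches(omega, pattern, 2)
print(s1, s2, s3, s4)  # 0 2 2 2
```

Here `s1 + s2` is the number of mismatches between the configuration and the pattern on `C`, and `s3 + s4` is the number of mismatches with the translated pattern on `C_x`.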


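With this notation what we have to estimate is

$$\sum_{\alpha n\le|x|<n}\frac{\mathbb{P}([A_n]^\varepsilon\cap\theta_x[A_n]^\varepsilon)}{\mathbb{P}([A_n]^\varepsilon)}=\sum_{\alpha n\le|x|<n}\mathbb{P}\left(S_3+S_4\le\varepsilon n^d\mid S_1+S_2\le\varepsilon n^d\right). \tag{4.7}$$

The following estimate is a corollary of [4] and a basic property of Gibbs
measures: for any configuration $\sigma$,

$$\mathbb{P}\left(\{\omega : \Delta(V_n,\omega,\sigma)\le\varepsilon|V_n|\}\mid\mathcal{F}_{V_n^c}\right)\le\exp\left(-|V_n|\,\mathcal{R}(\varepsilon)+c|\partial V_n|\right). \tag{4.8}$$

Indeed, the unconditioned statement is proved in [4], and conditioning can
at most introduce a term of order $\exp(c|\partial V_n|)$.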

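We proceed as follows:

$$\begin{aligned}
\mathbb{P}\left(\{S_1+S_2\le\varepsilon n^d\}\cap\{S_3+S_4\le\varepsilon n^d\}\right)
&\le\mathbb{P}\left(\{S_1+S_2\le\varepsilon n^d\}\cap\{S_4\le\varepsilon n^d\}\right)\\
&\le\sup\mathbb{P}\left(S_4\le\varepsilon n^d\mid\mathcal{F}_{\mathbb{Z}^d\setminus(C_x\setminus C)}\right)\mathbb{P}([A_n]^\varepsilon)\\
&\le\exp\left(-\alpha n^d\,\mathcal{R}\left(\frac{\varepsilon}{\alpha}\right)+cn^{d-1}\right)\mathbb{P}([A_n]^\varepsilon).
\end{aligned}$$

The last inequality uses (4.8) with $V_n=C_x\setminus C$, whose volume is at least
$\alpha n^d$, so that $\varepsilon n^d\le(\varepsilon/\alpha)|V_n|$. Therefore,

$$\sum_{\alpha n\le|x|<n}\frac{\mathbb{P}([A_n]^\varepsilon\cap\theta_x[A_n]^\varepsilon)}{\mathbb{P}([A_n]^\varepsilon)}\le n^d\exp\left(-\alpha n^d\,\mathcal{R}\left(\frac{\varepsilon}{\alpha}\right)+cn^{d-1}\right)=:C_n(\varepsilon,\alpha).$$

Taking into account that $\varepsilon/\alpha<\varepsilon_0$, and, hence, $\mathcal{R}(\varepsilon/\alpha)>0$, we conclude
that $C_n(\varepsilon,\alpha)\to 0$ as $n\to\infty$, and, hence,

$$C(\varepsilon,\alpha)=\sup_nC_n(\varepsilon,\alpha)$$

is finite. This completes the proof.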
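The finiteness of the supremum can be seen numerically: the volume term in the exponent eventually dominates both the surface term and the polynomial prefactor. The sketch below evaluates the logarithm of the bound for illustrative parameter values. The numbers chosen for the rate, for `alpha` and for `c` are hypothetical and are not taken from the paper.

```python
from math import log


def log_C_n(n, d, alpha, rate, c):
    """Logarithm of n^d * exp(-alpha * n^d * rate + c * n^(d-1))."""
    return d * log(n) - alpha * n ** d * rate + c * n ** (d - 1)


# Hypothetical values: d = 2, alpha = 0.4, R(eps / alpha) = 0.05, c = 1.
values = [log_C_n(n, 2, 0.4, 0.05, 1.0) for n in range(1, 201)]
best_n = values.index(max(values)) + 1
print(best_n, max(values), values[-1])
```

The maximum is attained at a moderate value of `n`, after which the logarithm decreases to minus infinity, so the supremum over `n` is finite.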


Remark 4.1. The parameters of Theorem 2.6 depend on the choice of
$\varepsilon$ and $\alpha$. The most interesting is the dependence of $\Lambda_1$ and $\Lambda_2$. Lemma 4.3
in [1] in fact shows that a uniform choice $\Lambda_2=2$ suffices. A more interesting
question is whether we can give a uniform bound on $\Lambda_1$ for a large set of $\varepsilon$
and $\alpha$. The present modification of the second moment estimate, together
with the rest of Lemma 4.3 in [1], which remains unchanged, gives that,
for some $c$, dependent on $\varepsilon$ alone, the following choice of $\Lambda_1=\Lambda_1(\varepsilon,\alpha)$ will
suffice:

$$\Lambda_1(\varepsilon,\alpha)=\frac{1}{c+C(\varepsilon,\alpha)}.$$

The rate distortion function $\mathcal{R}$ is a monotonically decreasing function. Hence,
for a fixed $\varepsilon>0$, $\alpha\mathcal{R}(\varepsilon/\alpha)$ is a monotonically increasing function of $\alpha$, and
finally, $C(\varepsilon,\alpha)$ is monotonically decreasing in $\alpha$. Therefore, if $\varepsilon<\varepsilon_0$, then
for all a > ao ■= 0.99^, 

$$\Lambda_1(\varepsilon,\alpha)\ge\Lambda_1(\varepsilon,\alpha_0)>0.$$

Therefore, for a fixed $\varepsilon>0$, we obtain a uniform (in $\alpha$) bound on the
parameter $\Lambda_1$.


4.2. Proof of Proposition 2.7. For $\varepsilon=0$, we know that most patterns
are $(0,\alpha)$-good for any $\alpha<1$. Indeed, it is proved in [1] (Lemma 5.3) that
$\mathbb{Q}(\mathcal{G}_n(0,\alpha))\ge 1-e^{-\kappa'n^d}$ for some $\kappa'>0$.

Let us now argue that for small $\varepsilon$ this is still the case. Suppose $\alpha<1/2$,
that is, we are going to consider vectors $x\in\mathbb{Z}^d$ such that $|x|<n/2$. An element
$A$ of $[A_n]^\varepsilon\cap\theta_x[A_n]^\varepsilon$ satisfies

$$\sum_{y\in C_x\cap C}|A(y)-A(y-x)|\le 2\varepsilon n^d. \tag{4.9}$$

(Recall that $C=C_n$ and $C_x=C_n+x$.) This implies that there exist a set
$V_n\subset C$ and a disjoint translate $V_n+z\subset C$ with $|V_n|\ge(1/2)^dn^d$ such
that $\theta_{-z}A_{V_n+z}$ matches $A_{V_n}$ up to an error fraction of at most $2^{d+1}\varepsilon$; the
probability of this event can be made as small as $e^{-\nu n^d}$ for some $\nu>0$, for $\varepsilon$
sufficiently small, uniformly in the pattern, by Lemma 2.3. Therefore, we obtain that

$$\mathbb{Q}(\mathcal{G}_n(\varepsilon,\alpha))\ge 1-e^{-\nu n^d} \tag{4.10}$$

for all $\alpha<1/2$ and $\varepsilon$ small enough.

4.3. Proof of Theorem 2.8. We consider the case $d=1$ only, because
the case $d\ge 2$ is completely analogous. Start with the particular pattern
$A_n=0\cdots0$ that we simply denote by $0_n$. The difficulty with this “bad
pattern” comes from the fact that the second moment estimate does not apply,
because (4.2) fails. Therefore, we have to prove by other means that there
exists $\delta>0$ such that, for all $n\in\mathbb{N}$,

$$\delta\le\mathbb{P}\left(T_{[0_n]^\varepsilon}>\frac{1}{\mathbb{P}([0_n]^\varepsilon)}\right)\le 1-\delta, \tag{4.11}$$

which would imply the nontriviality of the parameter $\Lambda_n$. We will first show
that there exists a sequence $k_n\uparrow\infty$ such that

$$\delta\le\mathbb{P}(T_{[0_n]^\varepsilon}>k_n)\le 1-\delta. \tag{4.12}$$

It will then follow easily from the Bernoulli character of $\mathbb{P}$ that $k_n$ does not
depend on the choice of the pattern, that is, (4.12) holds with the same $k_n$
for any pattern $A_n$. Then we can apply Theorem 2.6 for good patterns, and
obtain $k_n=1/\mathbb{P}([A_n]^\varepsilon)=1/\mathbb{P}([0_n]^\varepsilon)$. We have the following identities:

$$\begin{aligned}
\mathbb{P}(T_{[0_n]^\varepsilon}\le k_n)
&=\mathbb{P}\left(\min_{0\le k\le k_n}\sum_{i=k}^{k+n-1}\omega_i\le n\varepsilon\right)\\
&=\mathbb{P}\left(\max_{0\le k\le k_n}\sum_{i=k}^{k+n-1}(1-2\omega_i)\ge(1-2\varepsilon)n\right)\\
&=\mathbb{P}\left(\max_{0\le k\le k_n}(S_{k+n}-S_k)\ge(1-2\varepsilon)n\right),
\end{aligned} \tag{4.13}$$

where $S_n$ is the position of a simple random walk on $\mathbb{Z}$ (with $S_0=0$) after $n$
steps. By Theorem 7.23 in [12], together with the strong invariance principle
([12], page 53), we have

$$\max_{0\le k\le k_n}(S_{k+n}-S_k)=a\log k_n+b\log\log k_n+c+o(1)+X, \tag{4.14}$$

where $X$ is a random variable with a Gumbel distribution. Therefore, if we
choose $k_n$ such that

$$(1-2\varepsilon)n=a\log k_n+b\log\log k_n+c+o(1), \tag{4.15}$$

then (4.12) holds.

If we now choose any other pattern $A_n$, then, under $\mathbb{P}$, $S_n=2\sum_{i=0}^{n}\left(1/2-|\omega_i-A_n(i)|\right)$
is again distributed as a simple random walk, so we find the
same $k_n$, which completes the proof of the theorem.
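The translation into random-walk increments rests on an elementary identity: with steps $1-2\omega_i$, a window of $n$ sites contains at most $\varepsilon n$ ones exactly when the corresponding increment of the walk is at least $(1-2\varepsilon)n$. The sketch below checks this equivalence on a random sample. It uses exact rational arithmetic for $\varepsilon$ to avoid rounding issues; the function name and the convention that a window starting at `k` covers sites `k, ..., k+n-1` are assumptions made for the illustration.

```python
import random
from fractions import Fraction


def window_checks(omega, n, eps):
    """For every window start k, return the pair of truth values
    (window has at most eps*n ones, walk increment S[k+n] - S[k] >= (1 - 2*eps)*n)."""
    walk = [0]
    for w in omega:
        walk.append(walk[-1] + (1 - 2 * w))  # step +1 for spin 0, step -1 for spin 1
    results = []
    for k in range(len(omega) - n + 1):
        few_ones = sum(omega[k:k + n]) <= eps * n
        big_increment = walk[k + n] - walk[k] >= (1 - 2 * eps) * n
        results.append((few_ones, big_increment))
    return results


rng = random.Random(1)
omega = [rng.randint(0, 1) for _ in range(200)]
checks = window_checks(omega, 10, Fraction(1, 5))
print(all(a == b for a, b in checks))  # True
```

The equivalence holds because the increment over a window equals $n$ minus twice the number of ones in that window.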
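4.4. Proof of Proposition 3.1. By using Theorem 2.6, we immediately
get

$$\begin{aligned}
\mathbb{Q}\times\mathbb{P}&\left\{(\omega,\sigma) : \log\left(W_n^\varepsilon(\omega,\sigma)\,\mathbb{P}([\omega_{C_n}]^\varepsilon)\right)\ge\log t\right\}\\
&=\int d\mathbb{Q}(\omega)\,\mathbb{P}\left\{\sigma : \log\left(T_{[\omega_{C_n}]^\varepsilon}(\sigma)\,\mathbb{P}([\omega_{C_n}]^\varepsilon)\right)\ge\log t\right\}\\
&\le e^{-\Lambda_1t}+Ce^{-cn^d}+\sum_{A_n\notin\mathcal{G}_n(\varepsilon,\alpha)}\mathbb{Q}([A_n]).
\end{aligned}$$

Now we choose $t=t_n=\log(n^\gamma)$ with $\gamma>0$ such that $\Lambda_1\gamma>1$. This makes
the first term in the right-hand side summable in $n$. The last one equals
the $\mathbb{Q}$-measure of the complement of $\mathcal{G}_n(\varepsilon,\alpha)$, which is less than $e^{-\nu n^d}$ by
Proposition 2.7. We thus get the upper bound in (3.1) by an application of
the Borel-Cantelli lemma.

Now we turn to prove the lower bound in (3.1). Proceeding as before, we
get

$$\begin{aligned}
\mathbb{Q}\times\mathbb{P}&\left\{(\omega,\sigma) : \log\left(W_n^\varepsilon(\omega,\sigma)\,\mathbb{P}([\omega_{C_n}]^\varepsilon)\right)\le\log t\right\}\\
&\le\sum_{A_n\in\mathcal{G}_n(\varepsilon,\alpha)}\mathbb{Q}([A_n])\,\mathbb{P}\left\{\sigma : T_{[A_n]^\varepsilon}(\sigma)\,\mathbb{P}([A_n]^\varepsilon)\le t\right\}+1-\sum_{A_n\in\mathcal{G}_n(\varepsilon,\alpha)}\mathbb{Q}([A_n])\\
&\le\Lambda_2t+Ce^{-cn^d}+e^{-\nu n^d}.
\end{aligned}$$

We have used Theorem 2.6 and Proposition 2.7. We now choose $t=t_n=n^{-\gamma}$,
with $\gamma>1$, to get a summable upper bound in $n$ for the above probability.
An application of the Borel-Cantelli lemma gives the desired result and the
proof of the proposition is complete.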

4.5. Proof of Lemma 3.2. We only consider the case $q>0$, leaving the
(very similar) proof for the case $q<0$ to the reader. Let $\mathcal{S}$ be the system
of all rectangular boxes of the form

$$V=\mathbb{Z}^d\cap\prod_{k=1}^d[m_k,n_k]\quad\text{with } m_k,n_k\in\mathbb{Z},\ m_k<n_k.$$

Before proceeding, we have to extend Definition 2.4 somewhat. We will
denote by $\mathcal{G}_V(\varepsilon,\alpha)$ the set of good patterns supported on $V\in\mathcal{S}$. We shall need
Proposition 2.7, which remains valid if one replaces $\mathcal{G}_n(\varepsilon,\alpha)$ with $\mathcal{G}_V(\varepsilon,\alpha)$
and $n^d$ by $|V|$ in (2.13).

We are going to prove that the function $a:\mathcal{S}\to(-\infty,+\infty)$ defined as

$$a(V):=-\log\int\mathbb{P}^q([\sigma_V]^\varepsilon)\,d\mathbb{Q}_{\mathcal{G}_V(\varepsilon,\alpha)}(\sigma)$$

satisfies the following approximate sub-additive property:

$$a(V\cup V')\le a(V)+a(V')+C\,|\partial(V\cup V')|$$

for all $V,V'\in\mathcal{S}$ such that $V\cup V'\in\mathcal{S}$ and $V\cap V'=\emptyset$, where $C$ is a
constant (depending on $q$), and where $\partial V$ denotes the boundary of $V$. Of course,
$|\partial(V\cup V')|$ is a surface order correction. If such a property holds [together
with $a(V+x)=a(V)$, for all $x\in\mathbb{Z}^d$, $V\in\mathcal{S}$, which is obvious by
stationarity of the measure], then a generalized sub-additive lemma, obtained as
a combination of a lemma found in [8] and another one given in [7], will
guarantee that

$$\lim_{n\to\infty}\frac{a(C_n)}{|C_n|}$$


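exists, as we wish. For all $q\in\mathbb{R}$, $V,V'\in\mathcal{S}$ such that $V\cup V'\in\mathcal{S}$ and
$V\cap V'=\emptyset$, we have the following:

$$\begin{aligned}
\mathbb{P}^q([\sigma_{V\cup V'}]^\varepsilon)
&=\Bigg(\sum_{\omega_{V\cup V'}\in[\sigma_{V\cup V'}]^\varepsilon}\mathbb{P}([\omega_{V\cup V'}])\Bigg)^q\\
&\ge e^{-K_1|\partial(V\cup V')|}\Bigg(\sum_{\omega_{V\cup V'}\in[\sigma_{V\cup V'}]^\varepsilon}\mathbb{P}([\omega_V])\,\mathbb{P}([\omega_{V'}])\Bigg)^q\\
&\ge e^{-K_1|\partial(V\cup V')|}\,\mathbb{P}^q([\sigma_V]^\varepsilon)\,\mathbb{P}^q([\sigma_{V'}]^\varepsilon),
\end{aligned}$$

where $K_1,K_2$ are constants. The first inequality follows from the Gibbs
property and the second one is a simple consequence of the Hamming distance
property: a configuration that $\varepsilon$-matches $\sigma$ on $V$ and on $V'$ separately also
$\varepsilon$-matches it on $V\cup V'$. To complete the proof, we again use the Gibbs property to get

$$\begin{aligned}
\int\mathbb{P}^q([\sigma_{V\cup V'}]^\varepsilon)\,d\mathbb{Q}_{\mathcal{G}_{V\cup V'}(\varepsilon,\alpha)}(\sigma)
&=\sum_{\sigma_{V\cup V'}}\mathbb{P}^q([\sigma_{V\cup V'}]^\varepsilon)\,\mathbb{Q}_{\mathcal{G}_{V\cup V'}(\varepsilon,\alpha)}([\sigma_{V\cup V'}])\\
&\ge e^{-K_2|\partial(V\cup V')|}\sum_{\sigma_V}\mathbb{P}^q([\sigma_V]^\varepsilon)\,\mathbb{Q}_{\mathcal{G}_{V\cup V'}(\varepsilon,\alpha)}([\sigma_V])\sum_{\sigma_{V'}}\mathbb{P}^q([\sigma_{V'}]^\varepsilon)\,\mathbb{Q}_{\mathcal{G}_{V\cup V'}(\varepsilon,\alpha)}([\sigma_{V'}])\\
&\ge K_3\,e^{-K_2|\partial(V\cup V')|}\int\mathbb{P}^q([\sigma_V]^\varepsilon)\,d\mathbb{Q}_{\mathcal{G}_V(\varepsilon,\alpha)}(\sigma)\int\mathbb{P}^q([\sigma_{V'}]^\varepsilon)\,d\mathbb{Q}_{\mathcal{G}_{V'}(\varepsilon,\alpha)}(\sigma),
\end{aligned}$$

where $K_3$ is a constant. The second inequality is the consequence of
Proposition 2.7 if $|V\cup V'|$ is large enough. The lemma is proved.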

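4.6. Proof of Theorem 3.3. Since the proof of this theorem is very
similar to that of Theorem 2.7 in [1], we only sketch it to indicate the small
differences between them.

The starting point is of course to write

$$\iint\left(W_n^\varepsilon(\omega,\sigma)\right)^q\,d\mathbb{Q}_{\mathcal{G}_n(\varepsilon,\alpha)}(\omega)\,d\mathbb{P}(\sigma)=\int d\mathbb{Q}_{\mathcal{G}_n(\varepsilon,\alpha)}(\omega)\int\left(T_{[\omega_{C_n}]^\varepsilon}(\sigma)\right)^q\,d\mathbb{P}(\sigma).$$

Then we can mimic the proof of Theorem 2.7 in [1] by using Theorem 2.6
and the analog of Lemma 4.3 in [1], which holds true when $T_{[\omega_{C_n}]}$ is replaced
by $T_{[\omega_{C_n}]^\varepsilon}$, provided that $\omega_{C_n}$ is an $(\varepsilon,\alpha)$-good pattern (see the beginning
of the proof of Theorem 2.6), and $\omega$ is $\mathbb{Q}$-typical in the sense of Proposition
2.5. Notice that we integrate with respect to the conditional measure $\mathbb{Q}_{\mathcal{G}_n(\varepsilon,\alpha)}$,
which takes care of these two properties.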

Acknowledgment. We thank Z. Chi for providing us with his preprint 